In many real-world settings, successful human-AI collaboration requires humans to effectively integrate complementary sources of information into AI-informed decision-making. In practice, however, human decision-makers often lack knowledge of what information an AI model has access to relative to their own. There is little available guidance on how to effectively communicate about unobservables: features that may influence the outcome but are unavailable to the model. In this work, we conduct an online experiment to understand whether and how explicitly communicating potentially relevant unobservables influences how people integrate model outputs and unobservables when making predictions. Our findings indicate that prompts about unobservables can change how humans integrate model outputs and unobservables, but do not necessarily improve performance. Furthermore, the impact of these prompts can vary depending on decision-makers' prior domain expertise. We conclude by discussing implications for future research on, and the design of, AI-based decision support tools.
Generative, ML-driven interactive systems have the potential to change how people interact with computers in creative processes, turning tools into co-creators. However, it remains unclear how we can achieve effective human-AI collaboration in open-ended task domains. Communication in interactions with ML-driven systems involves several known challenges. An overlooked aspect of co-creative system design is how to better support users as they learn to collaborate with such systems. Here, we reframe human-AI collaboration as a learning problem: inspired by research on team learning, we hypothesize that learning strategies that work for human-human teams may also improve the collaboration effectiveness and quality of humans working with co-creative generative systems. In this position paper, we aim to promote team learning as a lens for designing more effective co-creative human-AI collaboration, and we highlight collaboration process quality as a design goal for co-creative systems. Furthermore, we outline a preliminary schematic framework for embedding team learning support into co-creative AI systems. Finally, we propose a research agenda and pose open questions for further study on supporting people in learning to collaborate with generative AI systems.
Hybrid human-ML systems are increasingly responsible for consequential decisions across a wide range of domains. A growing body of empirical and theoretical work has advanced our understanding of these systems. However, existing empirical results are mixed, and theoretical proposals are often mutually incompatible. In this work, we propose a unifying framework for understanding the conditions under which combining the complementary strengths of humans and ML leads to higher-quality decisions than either could produce alone, a phenomenon we call human-ML complementarity. We focus specifically on the context of human-ML predictive decision-making and investigate optimal ways of combining human and ML predictions, grounded in an understanding of the underlying sources of variation in their judgments. Within this scope, we make two crucial contributions. First, taking a computational perspective on decision-making and drawing on prior literature in psychology, machine learning, and human-computer interaction, we introduce a taxonomy describing the broad range of criteria along which human and machine decision-making differ. Second, formalizing our taxonomy allows us to study how human and ML predictive decisions should be optimally aggregated. We show that our proposed framework encompasses several existing models of human-ML complementarity as special cases. Last but not least, a preliminary exploratory analysis of our framework offers a key insight for future work on human-ML complementarity: the mechanism by which human and ML judgments are combined should be informed by the underlying causes of divergence in their decisions.
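To make the aggregation question concrete, the following is a minimal sketch of one classical special case, not the paper's framework: when human and ML predictions are unbiased with independent noise, inverse-variance weighting minimizes the variance of the combined prediction. The function name and example values are illustrative.

```python
def inverse_variance_combine(y_human, y_ml, var_human, var_ml):
    """Combine two unbiased predictions by inverse-variance weighting.

    If human and ML errors are independent and zero-mean, this weighting
    minimizes the variance of the combined prediction.
    """
    w_human = (1 / var_human) / (1 / var_human + 1 / var_ml)
    return w_human * y_human + (1 - w_human) * y_ml

# Example: the ML model is less noisy, so it receives more weight.
print(inverse_variance_combine(y_human=3.0, y_ml=5.0, var_human=4.0, var_ml=1.0))
# -> 4.6  (weight 0.2 on the human, 0.8 on the model)
```

When the divergence between the two judgments stems from something other than independent noise, say, the human observing features the model cannot, this simple weighting is no longer optimal, which is exactly why the underlying causes of divergence matter.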
Explainable AI (XAI) is a promising means of supporting human-AI collaboration in high-stakes visual detection tasks, such as damage detection from satellite imagery, since fully automated approaches are unlikely to be perfectly safe and reliable. However, most existing XAI techniques are not informed by an understanding of humans' task-specific needs for explanations. We therefore take a first step toward understanding what forms of XAI humans require in damage detection tasks. In an online crowdsourced study, we examined how people explain their own assessments when evaluating the severity of building damage from satellite imagery. Through this study with 60 crowdworkers, we surfaced six major strategies that humans use to explain their visual damage assessments. We present the implications of our findings for the design of XAI in such visual detection contexts and discuss opportunities for future research.
Non-parametric tests can determine the better of two stochastic optimization algorithms when benchmarking results are ordinal, like the final fitness values of multiple trials. For many benchmarks, however, a trial can also terminate once it reaches a pre-specified target value. When only some trials reach the target value, two variables characterize a trial's outcome: the time it takes to reach the target value (or not) and its final fitness value. This paper describes a simple way to impose linear order on this two-variable trial data set so that traditional non-parametric methods can determine the better algorithm when neither dominates. We illustrate the method with the Mann-Whitney U-test. A simulation demonstrates that U-scores are much more effective than dominance when tasked with identifying the better of two algorithms. We test U-scores by having them determine the winners of the CEC 2022 Special Session and Competition on Real-Parameter Numerical Optimization.
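A minimal sketch of the ordering described above, assuming minimization and using scipy's Mann-Whitney implementation; all names are illustrative, and ties receive arbitrary distinct ranks here rather than the midranks a full treatment would use.

```python
from scipy.stats import mannwhitneyu

def order_key(trial):
    """trial = (time_to_target, final_fitness, reached); lower key = better.

    Trials that reached the target come first, sorted by time to target;
    trials that missed follow, sorted by final fitness value.
    """
    time_to_target, final_fitness, reached = trial
    return (0, time_to_target) if reached else (1, final_fitness)

def u_test(trials_a, trials_b):
    """Mann-Whitney U-test on the jointly ranked two-variable outcomes."""
    pooled = sorted(trials_a + trials_b, key=order_key)
    rank_of = {id(t): r for r, t in enumerate(pooled)}
    return mannwhitneyu([rank_of[id(t)] for t in trials_a],
                        [rank_of[id(t)] for t in trials_b],
                        alternative="less")  # "less": A's ranks are better

# Algorithm A reaches the target in two of three trials; B never does.
a = [(120.0, 1e-8, True), (300.0, 1e-8, True), (None, 0.5, False)]
b = [(None, 0.9, False), (None, 0.3, False), (None, 2.1, False)]
print(u_test(a, b))
```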
We extend best-subset selection to linear Multi-Task Learning (MTL), where a set of linear models are jointly trained on a collection of datasets (``tasks''). Allowing the regression coefficients of tasks to have different sparsity patterns (i.e., different supports), we propose a modeling framework for MTL that encourages models to share information across tasks, for a given covariate, through separately 1) shrinking the coefficient supports together, and/or 2) shrinking the coefficient values together. This allows models to borrow strength during variable selection even when the coefficient values differ markedly between tasks. We express our modeling framework as a Mixed-Integer Program, and propose efficient and scalable algorithms based on block coordinate descent and combinatorial local search. We show our estimator achieves statistically optimal prediction rates. Importantly, our theory characterizes how our estimator leverages the shared support information across tasks to achieve better variable selection performance. We evaluate the performance of our method in simulations and two biology applications. Our proposed approaches outperform other sparse MTL methods in variable selection and prediction accuracy. Interestingly, penalties that shrink the supports together often outperform penalties that shrink the coefficient values together. We will release an R package implementing our methods.
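To fix ideas, the display below (our notation, not the paper's) sketches one form such a joint objective could take: an l0 term for within-task sparsity, a support-sharing penalty on the indicators z, and a value-sharing penalty on the coefficients themselves.

```latex
\min_{\beta^{(1)},\dots,\beta^{(K)}}
  \sum_{k=1}^{K} \frac{1}{n_k} \bigl\| y^{(k)} - X^{(k)} \beta^{(k)} \bigr\|_2^2
  + \lambda \sum_{k=1}^{K} \bigl\| \beta^{(k)} \bigr\|_0
  + \alpha_s \sum_{j=1}^{p} \sum_{k < k'} \bigl| z^{(k)}_j - z^{(k')}_j \bigr|
  + \alpha_v \sum_{j=1}^{p} \sum_{k < k'} \bigl( \beta^{(k)}_j - \beta^{(k')}_j \bigr)^2,
\qquad z^{(k)}_j = \mathbf{1}\bigl[\beta^{(k)}_j \neq 0\bigr]
```

Setting the value-sharing weight to zero couples only the supports, which matches the empirical observation above that support-sharing penalties often outperform penalties that shrink the coefficient values together.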
Weakly-supervised learning (WSL) has been proposed to alleviate the conflict between data annotation cost and model performance through employing sparsely-grained (i.e., point-, box-, scribble-wise) supervision and has shown promising performance, particularly in the image segmentation field. However, it is still a very challenging problem due to the limited supervision, especially when only a small number of labeled samples are available. Additionally, almost all existing WSL segmentation methods are designed for star-convex structures which are very different from curvilinear structures such as vessels and nerves. In this paper, we propose a novel sparsely annotated segmentation framework for curvilinear structures, named YoloCurvSeg, based on image synthesis. A background generator delivers image backgrounds that closely match real distributions through inpainting dilated skeletons. The extracted backgrounds are then combined with randomly emulated curves generated by a Space Colonization Algorithm-based foreground generator, via a multilayer patch-wise contrastive learning synthesizer. In this way, a synthetic dataset with both images and curve segmentation labels is obtained, at the cost of only one or a few noisy skeleton annotations. Finally, a segmenter is trained with the generated dataset and possibly an unlabeled dataset. The proposed YoloCurvSeg is evaluated on four publicly available datasets (OCTA500, CORN, DRIVE and CHASEDB1) and the results show that YoloCurvSeg outperforms state-of-the-art WSL segmentation methods by large margins. With only one noisy skeleton annotation (respectively 0.14%, 0.02%, 1.4%, and 0.65% of the full annotation), YoloCurvSeg achieves more than 97% of the fully-supervised performance on each dataset. Code and datasets will be released at https://github.com/llmir/YoloCurvSeg.
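The recipe lends itself to a compact illustration. Below is a toy, runnable sketch of the three-stage synthesis, not the released YoloCurvSeg code: a median filter stands in for learned inpainting, a random walk stands in for Space Colonization curves, and simple intensity blending stands in for the contrastive synthesizer.

```python
import numpy as np
from scipy.ndimage import binary_dilation, median_filter

def synthesize_sample(image, noisy_skeleton, rng):
    """image: 2-D float array; noisy_skeleton: 2-D bool array (one annotation)."""
    # 1. Background generator: fill the dilated skeleton region so the
    #    curvilinear structure is removed from the real image.
    mask = binary_dilation(noisy_skeleton, iterations=3)
    background = image.copy()
    background[mask] = median_filter(image, size=9)[mask]

    # 2. Foreground generator: emulate a random curve (here, a random walk).
    label = np.zeros_like(image, dtype=bool)
    r, c = rng.integers(0, image.shape[0]), rng.integers(0, image.shape[1])
    for _ in range(200):
        label[r % image.shape[0], c % image.shape[1]] = True
        r, c = r + rng.integers(-1, 2), c + rng.integers(-1, 2)

    # 3. Synthesizer: blend the curve into the background; the emulated
    #    curve doubles as a free segmentation label.
    synthetic = np.where(label, background * 0.5, background)
    return synthetic, label

rng = np.random.default_rng(0)
img = rng.random((64, 64))
skel = np.zeros((64, 64), bool)
skel[32, 10:50] = True
x, y = synthesize_sample(img, skel, rng)
```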
To build general robotic agents that can operate in many environments, it is often imperative for the robot to collect experience in the real world. However, this is often not feasible due to safety, time, and hardware restrictions. We thus propose leveraging the next best thing as real-world experience: internet videos of humans using their hands. Visual priors, such as visual features, are often learned from videos, but we believe that more information from videos can be utilized as a stronger prior. We build a learning algorithm, VideoDex, that leverages visual, action, and physical priors from human video datasets to guide robot behavior. These actions and physical priors in the neural network dictate the typical human behavior for a particular robot task. We test our approach on a robot arm and dexterous hand-based system and show strong results on various manipulation tasks, outperforming various state-of-the-art methods. Videos at https://video-dex.github.io
We propose AstroSLAM, a standalone vision-based solution for autonomous online navigation around an unknown target small celestial body. AstroSLAM is predicated on the formulation of the SLAM problem as an incrementally growing factor graph, facilitated by the use of the GTSAM library and the iSAM2 engine. By combining sensor fusion with orbital motion priors, we achieve improved performance over a baseline SLAM solution. We incorporate orbital motion constraints into the factor graph by devising a novel relative dynamics factor, which links the relative pose of the spacecraft to the problem of predicting trajectories stemming from the motion of the spacecraft in the vicinity of the small body. We demonstrate the excellent performance of AstroSLAM using both real legacy mission imagery and trajectory data courtesy of NASA's Planetary Data System, as well as real in-lab imagery data generated on a 3 degree-of-freedom spacecraft simulator test-bed.
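As an illustration of the factor-graph plumbing involved (a minimal sketch, not AstroSLAM itself), the snippet below builds a two-pose graph with the GTSAM library and updates it incrementally with iSAM2; a generic BetweenFactorPose3 stands in for the paper's custom relative dynamics factor, which would instead encode orbital motion priors.

```python
import numpy as np
import gtsam

graph = gtsam.NonlinearFactorGraph()
noise = gtsam.noiseModel.Diagonal.Sigmas(np.full(6, 0.1))  # rotation + translation sigmas

x0, x1 = gtsam.symbol('x', 0), gtsam.symbol('x', 1)
graph.add(gtsam.PriorFactorPose3(x0, gtsam.Pose3(), noise))  # anchor the first pose

# Relative-motion constraint between consecutive spacecraft poses. AstroSLAM
# devises a custom relative dynamics factor here; BetweenFactorPose3 is a
# generic stand-in for this sketch.
odom = gtsam.Pose3(gtsam.Rot3(), gtsam.Point3(1.0, 0.0, 0.0))
graph.add(gtsam.BetweenFactorPose3(x0, x1, odom, noise))

initial = gtsam.Values()
initial.insert(x0, gtsam.Pose3())
initial.insert(x1, odom)

isam = gtsam.ISAM2()           # incremental smoothing-and-mapping engine
isam.update(graph, initial)    # grow the factor graph incrementally
estimate = isam.calculateEstimate()
print(estimate.atPose3(x1).translation())
```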
Commonsense knowledge-graphs (CKGs) are important resources towards building machines that can 'reason' on text or environmental inputs and make inferences beyond perception. While current CKGs encode world knowledge for a large number of concepts and have been effectively utilized for incorporating commonsense in neural models, they primarily encode declarative or single-condition inferential knowledge and assume all conceptual beliefs to have the same likelihood. Further, these CKGs utilize a limited set of relations shared across concepts and lack a coherent knowledge organization structure resulting in redundancies as well as sparsity across the larger knowledge graph. Consequently, today's CKGs, while useful for a first level of reasoning, do not adequately capture deeper human-level commonsense inferences which can be more nuanced and influenced by multiple contextual or situational factors. Accordingly, in this work, we study how commonsense knowledge can be better represented by -- (i) utilizing a probabilistic logic representation scheme to model composite inferential knowledge and represent conceptual beliefs with varying likelihoods, and (ii) incorporating a hierarchical conceptual ontology to identify salient concept-relevant relations and organize beliefs at different conceptual levels. Our resulting knowledge representation framework can encode a wider variety of world knowledge and represent beliefs flexibly using grounded concepts as well as free-text phrases. As a result, the framework can be utilized as both a traditional free-text knowledge graph and a grounded logic-based inference system more suitable for neuro-symbolic applications. We describe how we extend the PrimeNet knowledge base with our framework through crowd-sourcing and expert-annotation, and demonstrate its application for more interpretable passage-based semantic parsing and question answering.
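For concreteness, here is one hypothetical way a single composite, probabilistic belief could be encoded under such a framework; the schema and field names are ours, not the paper's.

```python
# A hypothetical encoding of one belief, illustrating the two ideas above:
# composite (multi-condition) inferential knowledge with an explicit
# likelihood, organized under a hierarchical conceptual ontology.
belief = {
    "concept": "glass",
    "ontology_path": ["entity", "artifact", "container", "glass"],  # hierarchy
    "rule": {
        "if": ["glass is dropped", "floor is hard"],  # multiple conditions
        "then": "glass breaks",
    },
    "likelihood": 0.9,  # conceptual beliefs need not be equally likely
}
```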